Chapter 2 Preview of the Anomalo & O’Reilly Data Quality Guide

We’re thrilled to be working with O’Reilly on a book to help organizations discover new solutions for detecting and resolving data quality issues.

After releasing Chapter 1 last month, we’re making Chapter 2 available for free today. Chapter 2, “Data Quality Monitoring Strategies and the Role of Automation,” takes a comprehensive look at the historical approaches to data quality and at how the rise of machine learning is opening up new ways to monitor data at scale.

Click here to download Chapters 1 and 2 for free (https://www.anomalo.com/asset/automating-data-quality-monitoring-scaling-beyond-rules-with-machine-learning/)

The full book will be published by O’Reilly Media later this year. We’ll be giving away more chapters in the coming months, so follow our blog and LinkedIn for updates!

About the book

Automating Data Quality Monitoring at Scale is based on everything we’ve learned from building Anomalo. Chapter 1, “The Data Quality Imperative,” sets the stage by explaining why a business should care about data quality today. In Chapter 2, we discuss why existing strategies don’t scale to large amounts of data, and propose a new approach. You’ll learn:

  • What data quality monitoring should accomplish, and why it must go beyond basic observability
  • The pros and cons of the three main historical approaches:
    – Manual checks
    – Rule-based testing
    – Metrics monitoring
  • How machine learning can automate data quality monitoring while reducing false positives and alert fatigue
  • The drawbacks of relying exclusively on machine learning
  • How combining human expertise and automation yields a best-of-all-worlds approach

…and much more.

In future chapters, we’ll also share:

  • How to apply unsupervised machine learning models for detecting data issues
  • How to implement notifications while avoiding alert fatigue
  • How to integrate data quality monitoring with data catalogs, orchestration layers, and other systems
  • How to deploy, manage, and maintain your monitoring solution

Note that the contents of this preview will almost certainly change as we continue to craft the book and get feedback from early readers. If you have ideas to share or notice missing content, please let our editorial team know by reaching out to gobrien@oreilly.com.
